{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Set working directory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Python 3.8.2\r\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "cwd = os.path.split(os.getcwd())\n",
    "if cwd[-1] == 'tutorials':\n",
    "    os.chdir('..')\n",
    "!python --version"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Import modules"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from download_threat_information.download_threat_data import _download_attack, _download_capec, _download_cwe, _download_cve, main\n",
    "from download_threat_information.parsing_scripts.parse_attack_tactic_technique import link_tactic_techniques\n",
    "from download_threat_information.parsing_scripts.parse_cve import parse_cve_file\n",
    "from download_threat_information.parsing_scripts.parse_capec_cwe import parse_capec_cwe_files\n",
    "from utils.tutorial_util import print_files_in_folder"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Download threat data\n",
    "\n",
    "The threat data from MITRE and NIST need to be downloaded and parsed before building BRON. To download the threat data, run the following command:\n",
    "```\n",
    "python download_threat_information/download_threat_data.py --threat_data_type THREAT_DATA_TYPE (optional) --only_recent_cves (optional)\n",
    "```\n",
    "Not adding either of the optional arguments downloads data for all threat data types and all CVE data from 1999-2020 inclusive. To download threat data for only one threat data type, add the argument `--threat_data_type` and the name of the data type. `THREAT_DATA_TYPE` can be either 'ATTACK', 'CAPEC', 'CWE', or 'CVE'. To download only recent CVE data from 2015-2020 inclusive, add the argument `--only_recent_cves`.\n",
    "\n",
    "In this tutorial, we download threat data for all threat data types and all CVE data from 1999-2020. Note that the years start at 2002 because the CVE data from 1999-2002 are all contained in the CVE data from 2002."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "out_path = 'download_threat_information'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "cve_years = ['2002', '2003', '2004', '2005', '2006', '2007', '2008', '2009', '2010', '2011',\n",
    "             '2012', '2013', '2014', '2015', '2016', '2017', '2018', '2019', '2020']\n",
    "main(cve_years)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "'download_threat_information/download_threat_data.py 1608147872.5942457'\n",
      "'download_threat_information/__pycache__ 1608148090.9301898'\n",
      "'download_threat_information/parsing_scripts 1608148091.0808318'\n",
      "'download_threat_information/raw_enterprise_attack.json 1608148094.5972433'\n",
      "'download_threat_information/raw_CAPEC.json 1608148096.5065253'\n",
      "'download_threat_information/raw_CWE.zip 1608148097.0192606'\n",
      "'download_threat_information/raw_CVE_2002.json.gz 1608148097.4379542'\n",
      "'download_threat_information/raw_CVE_2003.json.gz 1608148098.5400357'\n",
      "'download_threat_information/raw_CVE_2004.json.gz 1608148099.0695996'\n",
      "'download_threat_information/raw_CVE_2005.json.gz 1608148099.8033426'\n",
      "'download_threat_information/raw_CVE_2006.json.gz 1608148100.785197'\n",
      "'download_threat_information/raw_CVE_2007.json.gz 1608148102.5427208'\n",
      "'download_threat_information/raw_CVE_2008.json.gz 1608148103.900211'\n",
      "'download_threat_information/raw_CVE_2009.json.gz 1608148105.474411'\n",
      "'download_threat_information/raw_CVE_2010.json.gz 1608148107.3424408'\n",
      "'download_threat_information/raw_CVE_2011.json.gz 1608148108.4307504'\n",
      "'download_threat_information/raw_CVE_2012.json.gz 1608148110.0601213'\n",
      "'download_threat_information/raw_CVE_2013.json.gz 1608148111.1695313'\n",
      "'download_threat_information/raw_CVE_2014.json.gz 1608148112.862786'\n",
      "'download_threat_information/raw_CVE_2015.json.gz 1608148114.820714'\n",
      "'download_threat_information/raw_CVE_2016.json.gz 1608148116.081137'\n",
      "'download_threat_information/raw_CVE_2017.json.gz 1608148118.8042517'\n",
      "'download_threat_information/raw_CVE_2018.json.gz 1608148122.6246011'\n",
      "'download_threat_information/raw_CVE_2019.json.gz 1608148125.8195086'\n",
      "'download_threat_information/raw_CVE_2020.json.gz 1608148130.4454238'\n",
      "'download_threat_information/raw_CVE.json.gz 1608148155.0599923'\n",
      "'download_threat_information/bron_meta_data.json 1608148155.1923862'\n"
     ]
    }
   ],
   "source": [
    "# Show saved files \n",
    "print_files_in_folder(out_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Parse ATT&CK Tactic and Technique data\n",
    "\n",
    "To parse the ATT&CK Tactic and Technique data, run the following command:\n",
    "```\n",
    "python download_threat_information/parsing_scripts/parse_attack_tactic_technique.py --filename FILENAME --save_path SAVE_PATH\n",
    "```\n",
    "`FILENAME` is the file path to `raw_enterprise_attack.json`, and `SAVE_PATH` is the folder path to save parsed threat data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "filename = os.path.join(out_path, 'raw_enterprise_attack.json')\n",
    "link_tactic_techniques(filename, out_path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "'download_threat_information/download_threat_data.py 1608147872.5942457'\n",
      "'download_threat_information/__pycache__ 1608148090.9301898'\n",
      "'download_threat_information/parsing_scripts 1608148091.0808318'\n",
      "'download_threat_information/raw_enterprise_attack.json 1608148094.5972433'\n",
      "'download_threat_information/raw_CAPEC.json 1608148096.5065253'\n",
      "'download_threat_information/raw_CWE.zip 1608148097.0192606'\n",
      "'download_threat_information/raw_CVE_2002.json.gz 1608148097.4379542'\n",
      "'download_threat_information/raw_CVE_2003.json.gz 1608148098.5400357'\n",
      "'download_threat_information/raw_CVE_2004.json.gz 1608148099.0695996'\n",
      "'download_threat_information/raw_CVE_2005.json.gz 1608148099.8033426'\n",
      "'download_threat_information/raw_CVE_2006.json.gz 1608148100.785197'\n",
      "'download_threat_information/raw_CVE_2007.json.gz 1608148102.5427208'\n",
      "'download_threat_information/raw_CVE_2008.json.gz 1608148103.900211'\n",
      "'download_threat_information/raw_CVE_2009.json.gz 1608148105.474411'\n",
      "'download_threat_information/raw_CVE_2010.json.gz 1608148107.3424408'\n",
      "'download_threat_information/raw_CVE_2011.json.gz 1608148108.4307504'\n",
      "'download_threat_information/raw_CVE_2012.json.gz 1608148110.0601213'\n",
      "'download_threat_information/raw_CVE_2013.json.gz 1608148111.1695313'\n",
      "'download_threat_information/raw_CVE_2014.json.gz 1608148112.862786'\n",
      "'download_threat_information/raw_CVE_2015.json.gz 1608148114.820714'\n",
      "'download_threat_information/raw_CVE_2016.json.gz 1608148116.081137'\n",
      "'download_threat_information/raw_CVE_2017.json.gz 1608148118.8042517'\n",
      "'download_threat_information/raw_CVE_2018.json.gz 1608148122.6246011'\n",
      "'download_threat_information/raw_CVE_2019.json.gz 1608148125.8195086'\n",
      "'download_threat_information/raw_CVE_2020.json.gz 1608148130.4454238'\n",
      "'download_threat_information/raw_CVE.json.gz 1608148155.0599923'\n",
      "'download_threat_information/bron_meta_data.json 1608148155.1923862'\n",
      "'download_threat_information/capec_technique_map.json 1608148156.8372931'\n",
      "'download_threat_information/technique_tactic_map.json 1608148156.8407865'\n",
      "'download_threat_information/technique_name_map.json 1608148156.8418293'\n"
     ]
    }
   ],
   "source": [
    "# Show saved files \n",
    "print_files_in_folder(out_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Parse Vulnerability (CVE) data\n",
    "\n",
    "To parse the Vulnerability data, run the following command:\n",
    "```\n",
    "python download_threat_information/parsing_scripts/parse_cve.py --cve_path CVE_PATH --save_path SAVE_PATH --only_recent_cves (optional)\n",
    "```\n",
    "`CVE_PATH` is the file path to `raw_CVE.json.gz`, and `SAVE_PATH` is the folder path to save parsed threat data. If the CVE data use only recent CVEs from 2015-2020, then add the argument `--only_recent_cves`. The folder to save parsed Vulnerability data should be the same folder that contains parsed Tactic and Technique data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "cve_path = os.path.join(out_path, 'raw_CVE.json.gz')\n",
    "only_recent_cves = False\n",
    "if only_recent_cves:\n",
    "    save_path_file = \"cve_map_cpe_cwe_score_2015_2020.json\"\n",
    "else:\n",
    "    save_path_file = \"cve_map_cpe_cwe_score.json\"\n",
    "save_file = os.path.join(out_path, save_path_file)\n",
    "parse_cve_file(cve_path, save_file)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "'download_threat_information/download_threat_data.py 1608147872.5942457'\n",
      "'download_threat_information/__pycache__ 1608148090.9301898'\n",
      "'download_threat_information/parsing_scripts 1608148091.0808318'\n",
      "'download_threat_information/raw_enterprise_attack.json 1608148094.5972433'\n",
      "'download_threat_information/raw_CAPEC.json 1608148096.5065253'\n",
      "'download_threat_information/raw_CWE.zip 1608148097.0192606'\n",
      "'download_threat_information/raw_CVE_2002.json.gz 1608148097.4379542'\n",
      "'download_threat_information/raw_CVE_2003.json.gz 1608148098.5400357'\n",
      "'download_threat_information/raw_CVE_2004.json.gz 1608148099.0695996'\n",
      "'download_threat_information/raw_CVE_2005.json.gz 1608148099.8033426'\n",
      "'download_threat_information/raw_CVE_2006.json.gz 1608148100.785197'\n",
      "'download_threat_information/raw_CVE_2007.json.gz 1608148102.5427208'\n",
      "'download_threat_information/raw_CVE_2008.json.gz 1608148103.900211'\n",
      "'download_threat_information/raw_CVE_2009.json.gz 1608148105.474411'\n",
      "'download_threat_information/raw_CVE_2010.json.gz 1608148107.3424408'\n",
      "'download_threat_information/raw_CVE_2011.json.gz 1608148108.4307504'\n",
      "'download_threat_information/raw_CVE_2012.json.gz 1608148110.0601213'\n",
      "'download_threat_information/raw_CVE_2013.json.gz 1608148111.1695313'\n",
      "'download_threat_information/raw_CVE_2014.json.gz 1608148112.862786'\n",
      "'download_threat_information/raw_CVE_2015.json.gz 1608148114.820714'\n",
      "'download_threat_information/raw_CVE_2016.json.gz 1608148116.081137'\n",
      "'download_threat_information/raw_CVE_2017.json.gz 1608148118.8042517'\n",
      "'download_threat_information/raw_CVE_2018.json.gz 1608148122.6246011'\n",
      "'download_threat_information/raw_CVE_2019.json.gz 1608148125.8195086'\n",
      "'download_threat_information/raw_CVE_2020.json.gz 1608148130.4454238'\n",
      "'download_threat_information/raw_CVE.json.gz 1608148155.0599923'\n",
      "'download_threat_information/bron_meta_data.json 1608148155.1923862'\n",
      "'download_threat_information/capec_technique_map.json 1608148156.8372931'\n",
      "'download_threat_information/technique_tactic_map.json 1608148156.8407865'\n",
      "'download_threat_information/technique_name_map.json 1608148156.8418293'\n",
      "'download_threat_information/cve_map_cpe_cwe_score.json 1608148186.2785726'\n"
     ]
    }
   ],
   "source": [
    "print_files_in_folder(out_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Parse Attack Pattern (CAPEC) and Weakness (CWE) data\n",
    "\n",
    "To parse the Attack Pattern and Weakness data, run the following command:\n",
    "```\n",
    "python download_threat_information/parsing_scripts/parse_capec_cwe.py --capec_file CAPEC_FILE --cwe_file CWE_FILE --save_path SAVE_PATH\n",
    "```\n",
    "`CAPEC_FILE` is the file path to `raw_CAPEC.json`, `CWE_FILE` is the file path to `raw_CWE.zip`, and `SAVE_PATH` is the folder path to save parsed threat data. The folder to save parsed Attack Pattern and Weakness data should be the same folder that contains parsed Tactic and Technique data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "capec_file = os.path.join(out_path, 'raw_CAPEC.json')\n",
    "cwe_file = os.path.join(out_path, 'raw_CWE.zip')\n",
    "parse_capec_cwe_files(capec_file, cwe_file, save_path=out_path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "'download_threat_information/download_threat_data.py 1608147872.5942457'\n",
      "'download_threat_information/__pycache__ 1608148090.9301898'\n",
      "'download_threat_information/parsing_scripts 1608148091.0808318'\n",
      "'download_threat_information/raw_enterprise_attack.json 1608148094.5972433'\n",
      "'download_threat_information/raw_CAPEC.json 1608148096.5065253'\n",
      "'download_threat_information/raw_CWE.zip 1608148097.0192606'\n",
      "'download_threat_information/raw_CVE_2002.json.gz 1608148097.4379542'\n",
      "'download_threat_information/raw_CVE_2003.json.gz 1608148098.5400357'\n",
      "'download_threat_information/raw_CVE_2004.json.gz 1608148099.0695996'\n",
      "'download_threat_information/raw_CVE_2005.json.gz 1608148099.8033426'\n",
      "'download_threat_information/raw_CVE_2006.json.gz 1608148100.785197'\n",
      "'download_threat_information/raw_CVE_2007.json.gz 1608148102.5427208'\n",
      "'download_threat_information/raw_CVE_2008.json.gz 1608148103.900211'\n",
      "'download_threat_information/raw_CVE_2009.json.gz 1608148105.474411'\n",
      "'download_threat_information/raw_CVE_2010.json.gz 1608148107.3424408'\n",
      "'download_threat_information/raw_CVE_2011.json.gz 1608148108.4307504'\n",
      "'download_threat_information/raw_CVE_2012.json.gz 1608148110.0601213'\n",
      "'download_threat_information/raw_CVE_2013.json.gz 1608148111.1695313'\n",
      "'download_threat_information/raw_CVE_2014.json.gz 1608148112.862786'\n",
      "'download_threat_information/raw_CVE_2015.json.gz 1608148114.820714'\n",
      "'download_threat_information/raw_CVE_2016.json.gz 1608148116.081137'\n",
      "'download_threat_information/raw_CVE_2017.json.gz 1608148118.8042517'\n",
      "'download_threat_information/raw_CVE_2018.json.gz 1608148122.6246011'\n",
      "'download_threat_information/raw_CVE_2019.json.gz 1608148125.8195086'\n",
      "'download_threat_information/raw_CVE_2020.json.gz 1608148130.4454238'\n",
      "'download_threat_information/raw_CVE.json.gz 1608148155.0599923'\n",
      "'download_threat_information/bron_meta_data.json 1608148155.1923862'\n",
      "'download_threat_information/capec_technique_map.json 1608148156.8372931'\n",
      "'download_threat_information/technique_tactic_map.json 1608148156.8407865'\n",
      "'download_threat_information/technique_name_map.json 1608148156.8418293'\n",
      "'download_threat_information/cve_map_cpe_cwe_score.json 1608148186.2785726'\n",
      "'download_threat_information/capec_names.json 1608148188.0269263'\n",
      "'download_threat_information/cwe_names.json 1608148188.0296562'\n",
      "'download_threat_information/capec_cwe_mapping.json 1608148188.0345268'\n"
     ]
    }
   ],
   "source": [
    "print_files_in_folder(out_path)"
   ]
  }
 ],
 "metadata": {
  "@webio": {
   "lastCommId": null,
   "lastKernelId": null
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}